The above-referenced project commenced on May 1 2006, with project termination
date  initially  set  as  April  30  2009,  and  total  cost  of  $336638  (including  indirect 
cost).  On  May  7  2007,  a  one-time  supplement  fund  of  $75000  was  awarded.  On  Oct  8 
2007,  upon  the  request  of  the  University  of  Michigan,  PI  was  changed  to  Prof. 
William  Gehring  of  the  Department  of  Psychology,  University  of  Michigan.  On  April 
9, 2011, upon the request of the University of Michigan, the PI was changed back to Prof.
Jun Zhang, the original PI. No-cost-extension (NCE) applications were filed and
approved, so that the grant termination date was eventually set to March 30 2012. The
project concluded as of March 30 2012.

The  project  was  about  dynamic  decision-making  under  time  pressure,  when  one  is 
faced  with  the  tradeoff  between  the  benefit  of  improving  decision  accuracy 
associated  with  continued  observation  of  the  environment  and  the  cost  for  making 
additional observations and/or for delaying a decision. The modeling framework
adopted was Bayesian sequential analysis, with a random-walk/drift-diffusion model of
the evidence accumulation process, and optimal combination of bottom-up evidence
accumulation (likelihood function) with top-down contextual knowledge (prior
expectation) as the stream of data flows in. Such analyses were to be applied to Automatic
Target Recognition (ATR) systems, which acquire a sequence of images in which
potential targets (aircraft, tanks, etc.) are embedded but not immediately and/or
obviously detectable. The project was a collaboration between the PI at the
University  of  Michigan  and  Dr.  Daniel  Repperger  at  AFRL  Wright-Patterson  Air 
Force  Base. 
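
The sequential Bayesian scheme described above can be illustrated with a minimal sketch. It is not the project's model: it assumes each observation is Gaussian with mean +mu (target present) or -mu (target absent), and the function name, parameters, and decision threshold are all hypothetical choices made for illustration. The prior expectation enters as the starting log-odds, and each observation adds its log-likelihood ratio, which is a random walk toward one of two decision bounds.

```python
import math

def sequential_decision(observations, mu=0.5, sigma=1.0,
                        prior_target=0.5, threshold=3.0):
    """Accumulate posterior log-odds until a decision bound is crossed.

    Assumed model (illustrative only): x ~ N(+mu, sigma^2) if a target is
    present, x ~ N(-mu, sigma^2) otherwise.
    """
    # Top-down contextual knowledge: prior log-odds of "target present".
    log_odds = math.log(prior_target / (1.0 - prior_target))
    for n, x in enumerate(observations, start=1):
        # Bottom-up evidence: Gaussian log-likelihood ratio of one sample.
        log_odds += 2.0 * mu * x / sigma ** 2
        if log_odds >= threshold:
            return "target", n
        if log_odds <= -threshold:
            return "no target", n
    return "undecided", len(observations)
```

A higher threshold buys accuracy at the price of more observations, which is the tradeoff the project set out to study; a stronger prior shortens the walk to the bound.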

During the execution of the project, two unexpected events occurred that
significantly impacted the original plan. First, the original PI, Dr. Jun Zhang, was
on an IPA assignment to AFOSR from Sept 1, 2007, initially for two years but later
extended until Jan 10 of 2011. He relocated to Arlington VA and served as the
Program  Manager  of  the  Mathematical  Modeling  of  Cognition  and  Decision  Program 
at  the  Directorate  of  Mathematics,  Information  and  Life  Sciences.  During  his  absence 
from the University of Michigan, Dr. Gehring served as the Project Director, and Dr.
Zhang (though not deriving any salary or fringe benefits from the grant) continued
to supervise students carrying out research activities under this project.
Second, Dr. Repperger of AFRL suffered a heart attack and died on Jan 10,
2010. The untimely passing of Dr. Repperger, an AFRL Fellow, was a tremendous
loss to the Lab and negatively impacted the project, as the ATR portion of the project
was a collaborative effort.

Despite these adverse events, we were still able to make significant progress in
the following four areas for understanding decision making and its neural
mechanisms: 1) the analysis of event-related potential (ERP) components related to
stimulus, response, and the decision process that links the two; 2) the analysis of
neuronal activities related to decision making; 3) the modeling of motivational force
("incentive salience") for decision making; 4) strategic reasoning and decision
making in games. A detailed report of research findings is given below.
Summary of Research Findings:

Four main areas of research were conducted under the support
of this grant.

1)  Decomposing  ERP  components  related  to  stimulus,  response  and  the  decision  that 
links  the  two  (Publication  #5,  #6,  #10,  #12) 

Event-related potentials (ERPs) reflect the brain activities related to specific
behavioral events, and are obtained by averaging across many trial repetitions with
individual trials aligned to the onset of a specific event, e.g., the onset of the stimulus
(s-aligned) or the onset of the behavioral response (r-aligned). Examples of the former
include the P300 and N400 components, and examples of the latter include the
error-related negativity (ERN) and the lateralized readiness potential (LRP). However, the
s-aligned and r-aligned ERP waveforms do not purely reflect, respectively, the underlying
stimulus (S-) or response (R-) component waveform, because the two components
cross-contaminate each other in the recorded ERP waveforms. Earlier, Zhang [J. Neurosci.
Methods, 80, pp. 49-63, 1998] proposed an algorithm to recover the pure S-component
waveform and the pure R-component waveform from the s-aligned and
r-aligned ERP average waveforms. However, due to the nature of this inverse
problem, a direct solution is sensitive to noise that disproportionately affects
low-frequency components, hindering the practical implementation of this algorithm.
During the grant period, working with an exchange student, Gang Yin, we applied the
Wiener deconvolution technique to deal with noise in the input data, and investigated a
Tikhonov regularization approach to obtain a stable solution that is robust to
sampling variability in the reaction-time distribution (when the number of trials is
low). This method was demonstrated using data from a Go/NoGo experiment on
image classification and recognition.
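
Why the inverse problem is ill-conditioned at low frequencies, and how a Tikhonov-style term stabilizes it, can be seen in a small sketch. It is not the implementation used in the project. It assumes an additive model in which the s-aligned average is the S-component plus the R-component smeared by the reaction-time (RT) distribution, and vice versa, and it uses circular (FFT-based) convolution for simplicity. The function names and the regularization weight `lam` are hypothetical.

```python
import numpy as np

def forward_averages(s_comp, r_comp, rt_pmf):
    """Assumed forward model: mix pure components into the two averages."""
    n = len(s_comp)
    P = np.fft.fft(rt_pmf, n)                  # spectrum of the RT distribution
    S, R = np.fft.fft(s_comp), np.fft.fft(r_comp)
    es = np.fft.ifft(S + P * R).real           # R-component delayed by the RT
    er = np.fft.ifft(R + np.conj(P) * S).real  # S-component advanced by the RT
    return es, er

def decompose_erp(es, er, rt_pmf, lam=1e-3):
    """Recover S- and R-components from s-aligned and r-aligned averages.

    Per frequency, the assumed model is a 2x2 linear system with determinant
    1 - |P|^2. Since |P| approaches 1 at low frequencies, plain inversion
    amplifies noise there; det / (det^2 + lam) is a Tikhonov-damped inverse.
    """
    n = len(es)
    P = np.fft.fft(rt_pmf, n)
    Es, Er = np.fft.fft(es), np.fft.fft(er)
    det = 1.0 - np.abs(P) ** 2
    inv = det / (det ** 2 + lam)               # tends to 1/det as lam tends to 0
    S_hat = (Es - P * Er) * inv
    R_hat = (Er - np.conj(P) * Es) * inv
    return np.fft.ifft(S_hat).real, np.fft.ifft(R_hat).real
```

Under this model the zero-frequency (DC) term cannot be separated at all, because the RT distribution sums to one and the determinant vanishes there. A broader RT distribution makes |P| fall off faster and improves conditioning.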

Our method was applied to a study of patients with obsessive-compulsive disorder
(OCD), for whom the literature reports an increased error-related
negativity (ERN). In that study, with medication use properly controlled, we found
greater ERNs in OCD patients than in controls, irrespective of medication use,
suggesting that elevated error signals in OCD may be disorder-specific.

Finally, we were able to extend our analytic tools to deal with three or more
markers in a single trial, and to recover the individual ERP components that are
time-locked to those markers. As an application, we analyzed a cuing experiment with
three events: cue, stimulus and response. The interval between cue and stimulus was
varied from trial to trial by the experimenter, and the time between stimulus and
response was determined by the subjects (reaction-time variation). Our
decomposition results show that the cue-dependent component waveform
flattens out about 500 ms after cue onset, a finding consistent with our experimental
paradigm.

The suite of methods we developed under the grant support addresses
a problem on which traditional methods may fail (i.e., when the event-time
distributions are widely spread), and so complements existing techniques in ERP
multi-component analysis. Though our simulations are based on the ERP context, the basic
mathematical technique behind our method can also be adapted to deal with
event-related signals in other neuro-imaging studies (e.g., fMRI), such that
trial-by-trial variation in behavioral reaction time is no longer an obstacle but rather an
opportunity for isolating the underlying neurocognitive processes mediating a task.


2)  Analyzing  neuronal  activity  related  to  perceptual  decision  making  (Publication  #7) 

Random-walk/drift-diffusion models have in recent years been used to model the
neural basis of decision making. Neurophysiological data have provided support that
neurons in certain brain areas (such as MT and LIP) are responsible for the
perceptual decision of an animal (monkey) in visual discrimination tasks. However,
the exact role of each individual recorded neuron, namely sensory, motor, or
sensorimotor transformation ("decision" to translate from sensory representation
to motor representation), is not completely clear. In our study performed under the
grant, we applied novel mathematical techniques to analyze a dataset
published in 2002 by Roitman and Shadlen, who showed in a random-dot
motion-discrimination paradigm that an information accumulation model with a
threshold-crossing mechanism can account for the activity of lateral intraparietal
area (LIP) neurons. Our specific aim was to quantify the sensory
versus  motor  representation  of  the  neuronal  activity  during  the  time  course  of  a 
trial.  A  technique  based  on  Signal  Detection  Theory  was  applied  to  provide  indices 
to  quantify  how  neuronal  firing  activity  is  responsible  for  encoding  the  stimulus  or 
selecting  the  response  at  the  behavioral  level.  Additionally,  a  statistical  model  based 
on  Poisson  regression  was  used  to  provide  an  orthogonal  decomposition  of  the 
neural  activity  into  stimulus,  response,  and  stimulus-response  mapping 
components.  The  temporal  dynamics  of  the  sensorimotor  locus  of  the  LIP  activity 
indicated  that  there  is  no  stimulus-response  mapping-specific  neuronal  firing 
activity  throughout  a  trial;  the  neural  activity  toward  the  saccadic  onset  reflects  the 
development  of  the  motor  representation,  and  the  neural  activity  in  the  beginning  of 
a  trial  contains  little,  if  any,  information  about  the  sensory  representation. 
Sensorimotor analysis of individual neurons also showed that the neuronal
activation, as a population, represents the pending saccadic direction and carries little
information about the direction of the motion stimulus.
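
As an illustration of what an SDT-based index can look like, the sketch below computes the area under the ROC curve between two sets of spike counts, once with trials grouped by stimulus and once grouped by response. The report does not specify its indices, so the choice of ROC area, the function names, and the trial format are assumptions made for illustration. An area of 0.5 means the spike counts carry no information about the grouping variable, and 1.0 means perfect discrimination.

```python
def roc_area(counts_a, counts_b):
    """Area under the ROC curve: P(a > b) + 0.5 * P(a == b) over all pairs."""
    wins = ties = 0
    for a in counts_a:
        for b in counts_b:
            if a > b:
                wins += 1
            elif a == b:
                ties += 1
    return (wins + 0.5 * ties) / (len(counts_a) * len(counts_b))

def sensorimotor_indices(trials):
    """trials: list of (stimulus, response, spike_count), labels 'L' or 'R'.

    Returns (stimulus_index, response_index). Error trials, where stimulus
    and response disagree, are what allow the two indices to differ.
    """
    stim_r = [c for s, _, c in trials if s == "R"]
    stim_l = [c for s, _, c in trials if s == "L"]
    resp_r = [c for _, r, c in trials if r == "R"]
    resp_l = [c for _, r, c in trials if r == "L"]
    return roc_area(stim_r, stim_l), roc_area(resp_r, resp_l)
```

Computing the two indices in successive time windows would give a time course of how stimulus-like or response-like the firing is, in the spirit of the analysis described above.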

Our technical innovation allowed us to analyze the information accumulation
process in the LIP activity and to examine the sensorimotor nature of the
representation of information encoded by recorded neurons on a trial-by-trial basis.
Our  SDT-based  analysis  provides  a  quantitative  measure  of  sensorimotor  locus  of 
the  neuron  at  each  time  point,  and  the  Poisson  regression  model  incorporating 
orthogonal  decomposition  of  neuronal  activity  provides  a  quantitative  assessment 
as  to  how  the  stimulus,  response,  and  stimulus-response  mapping  components 
contribute  to  the  spike  activity. 
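
The idea of an orthogonal decomposition into stimulus, response, and stimulus-response mapping terms can be sketched with a saturated Poisson log-linear model over the four stimulus-by-response cells. This is an illustrative assumption, not the model fitted in the project: stimulus and response are coded as -1/+1 contrasts, the mapping term is their product, and the function name and trial format are hypothetical. For a saturated model the maximum-likelihood fitted rates equal the cell mean counts, so the coefficients follow in closed form.

```python
import numpy as np

def loglinear_decomposition(trials):
    """trials: list of (s, r, spike_count) with s, r in {-1, +1}.

    Model: log(rate) = b0 + bs*s + br*r + bm*(s*r).
    Assumes every cell has at least one trial and a positive mean count.
    """
    cells = [(-1, -1), (-1, +1), (+1, -1), (+1, +1)]
    means = []
    for s, r in cells:
        counts = [c for ts, tr, c in trials if (ts, tr) == (s, r)]
        means.append(np.mean(counts))
    # Contrast matrix; its columns are mutually orthogonal (X.T @ X = 4 I),
    # so the inverse of X is X.T / 4.
    X = np.array([[1, s, r, s * r] for s, r in cells], dtype=float)
    beta = X.T @ np.log(means) / 4.0
    return dict(zip(["baseline", "stimulus", "response", "mapping"], beta))
```

A neuron whose firing depends only on the upcoming response would show a nonzero "response" coefficient and near-zero "stimulus" and "mapping" coefficients.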


3)  Modeling  motivational  effects  or  "incentive  salience"  in  decision  making 
(Publication  #2,  #4,  #8) 

The motivational impact on decision making has been widely modeled using the
reinforcement learning paradigm. Incentive salience is a motivational "magnet"
property attributed to reward-predicting conditioned stimuli (cues). This property
makes  the  cue  and  its  associated  unconditioned  reward  'wanted'  at  that  moment, 
and  pulls  an  individual’s  behavior  towards  those  stimuli.  The  incentive-sensitization 
theory  of  K.  Berridge  and  T.  Robinson,  which  was  initially  proposed  in  the  drug 
addiction  context,  posits  that  permanent  changes  in  brain  mesolimbic  systems  in 
drug  addicts  can  amplify  the  incentive  salience  of  Pavlovian  drug  cues  to  produce 
excessive "wanting" to take drugs. Collaborating with Berridge and colleagues, we
built a computational model of incentive salience that captures the motivational impact
on reward learning, and contrasted it with traditional cache-based models of
reinforcement learning. Our motivation-based model incorporates dynamically
modulated physiological brain states that change the ability of cues to elicit
"wanting" on the fly. These presumed brain states include the presence of a drug of
abuse and longer-term mesolimbic sensitization, both of which boost
mesocorticolimbic cue-triggered signals. We tested our model using recorded
neuronal activity from mesolimbic output signals for reward and Pavlovian cues in
the ventral pallidum (VP); a novel technique for analyzing the neuronal firing
"profile" provided evidence in support of our dynamic motivational account of
incentive salience.
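
The contrast between a cached value and a state-modulated "wanting" signal can be shown in a toy sketch. It is not the published model: it assumes a simple delta-rule learner for the cached cue value and a multiplicative gain, here called `kappa`, standing in for the current physiological state. Both function names and the multiplicative form are illustrative assumptions.

```python
def learn_cached_value(rewards, alpha=0.1):
    """Cache-based learning: the cue value tracks past rewards by a delta rule."""
    value = 0.0
    for r in rewards:
        value += alpha * (r - value)   # move toward the experienced reward
    return value

def cue_wanting(cached_value, kappa=1.0):
    """Incentive salience of the cue under the current brain state.

    kappa = 1 reproduces the cached value; kappa > 1 (for example, an assumed
    sensitized or drug-primed state) amplifies "wanting" immediately, with no
    further learning trials.
    """
    return kappa * cached_value
```

In a purely cache-based model the cue value can change only through new experience with the reward, whereas here a change in `kappa` alters the cue's pull on the very next encounter.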


4)  Modeling  and  empirical  study  of  strategic  reasoning  for  decision  making  in  games 
(Publication  #3  and  #11) 

Strategic  interpersonal  interaction,  as  modeled  by  mathematical  game  theory, 
involves  rational  players  weighing  their  choice  of  actions  through  analyzing  player- 
specific  payoffs  associated  with  outcomes  that  are  jointly  determined  by  their  own 
and  their  opponents’  choices  (Von  Neumann  &  Morgenstern,  1944;  Luce  &  Raiffa, 
1957). This is decision making under uncertainty in the social domain. Traditional
game-theoretic approaches assume that players take full advantage of common knowledge
and rationality (CKR) in games of complete information (Binmore, 1992; Osborne &
Rubinstein, 1994). Common knowledge is said to exist when all players know
something to be true, know that all players know it to be true, know that all players
know all players know it to be true, and so on. Normative (equilibrium) solutions of
games  require  recursive  modeling  of  other  players  to  its  full  depth,  leading  to  the 
framework  of  epistemic  game  theory  (Mertens  &  Zamir,  1985;  Brandenburger  & 
Dekel,  1993).  Such  recursive  modeling  is  manifested  in  developmental  psychology 
as  the  so-called  "Theory-of-Mind"  (ToM)  reasoning.  In  this  study,  we  manipulated 
participants’  perspectives  in  games  in  order  to  differentiate  ToM-based  recursive 
reasoning  from  the  confounding  factor  of  decision  horizon  in  look-ahead  planning 
or  backward  induction  in  multistage  games. 

In  a  two-person,  three-stage  board  game  we  designed,  players  take  turns  in 
controlling  the  progression  or  termination  of  the  game  (from  Cell  A  to  B  and  to  C).  In 
predicting Player II's optimal choice at Cell B, participants adopt a first-person
perspective (1PP, "planning") when assigned the role of Player II, or a third-person
perspective (3PP, "anticipation") when assigned the role of Player I. The need for
sequential planning is equivalent for both assignments: the payoff comparisons
involved are formally identical and require the same working memory load.
However,  we  showed  a  clear  advantage  for  1PP  over  3PP  in  achieving  predictive 
reasoning  (i.e.,  in  considering  Player  I’s  countermove  upon  arriving  in  Cell  C). 
Although  most  participants  in  1PP  and  3PP  began  with  a  myopic  ToM  strategy, 
those  in  1PP  were  more  likely  to  eventually  acquire  predictive  ToM  reasoning. 
Participants  in  3PP  are  placed  farther  up  in  the  analysis  stream  (compared 
with  1PP),  with  the  corresponding  disadvantage  of  having  to  process  one  more 
level of ToM recursion. This suggests that we are more ready to anticipate others'
reactions to an action we plan than to recognize that others, when planning
their action, may already have taken into account possible counter-reactions from
ourselves. Our study distinguished "instrumental rationality" in decision making (the
ability to rationally choose optimal actions given a belief-desire state) from
"inductive rationality" (the ability to establish the most predictive model of the
opponent) in strategic reasoning.
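
The difference between myopic and predictive reasoning at Cell B can be made concrete with a small backward-induction sketch. The payoffs, the name "D" for the cell reached if Player I moves on from Cell C, and the function names are all hypothetical and are not taken from the experiment. Each cell maps to a pair (payoff to Player I, payoff to Player II).

```python
def player1_choice_at_c(payoffs):
    """Player I's countermove at Cell C: move on only if it pays Player I more."""
    return "D" if payoffs["D"][0] > payoffs["C"][0] else "C"

def player2_choice_at_b(payoffs, predictive=True):
    """Player II's decision at Cell B: stay, or move the game on to Cell C.

    A myopic player compares B with C directly. A predictive player first
    works out where the game would actually end once Player I has responded.
    """
    end_cell = player1_choice_at_c(payoffs) if predictive else "C"
    return "move" if payoffs[end_cell][1] > payoffs["B"][1] else "stay"

# Hypothetical payoffs: (Player I, Player II) at each cell.
example_payoffs = {"B": (3, 2), "C": (1, 3), "D": (2, 1)}
```

With these payoffs a myopic Player II moves, attracted by the payoff of 3 at Cell C, whereas a predictive Player II stays, foreseeing that Player I would move on to D and leave Player II with 1. A participant in the role of Player I who predicts Player II's choice must run this same computation from the outside, which adds one level of recursion.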

In a separate study, we applied such recursive ToM reasoning ("I think you think I
think . . . ") to the normative solution of games, known as meta-game analysis
(Howard, 1968). The case is an international political dispute involving the Taiwan
Strait. The Cross-Strait relations were modeled as a three-person game, with
Taiwan, China, and the U.S. as players. Preferences of these nation-states over
various outcomes were given as the starting point, and equilibria of meta-game
strategies and meta-game outcomes were derived.


